A method of detecting and correcting an error in a string of
information signals. When each information signal represents a word,
the method detects and corrects spelling errors. The method detects and
corrects an error which is a properly spelled word, but which is the
wrong (not intended) word. For example, the method is capable of
detecting and correcting a misspelling of "HORSE" as "HOUSE". In the
spelling error detection and correction method, a first word in an
input string of words is changed to form a second word different from
the first word, to form a candidate string of words. The spellings of
the first word and the second word are both in the spelling dictionary.
The probability of occurrence of the input string of words is compared
to the product of the probability of occurrence of the candidate string
of words multiplied by the probability of misrepresenting the candidate
string of words as the input string of words. If the former is greater
than or equal to the latter, no correction is made. If the former is
less than the latter, the candidate string of words is selected as a
spelling correction.

36 Claims, 5 Drawing Sheets 
METHOD AND APPARATUS FOR "WRONG WORD" SPELLING ERROR DETECTION AND
CORRECTION

BACKGROUND OF THE INVENTION 

The invention relates to methods and apparatus for detecting and
correcting errors in information signals. More specifically, the
invention relates to the detection and correction of spelling errors.

In text processing apparatus, such as dedicated word processors or word
processing programs which are run on general purpose digital computers,
it is desirable to provide automatic detection and correction of
spelling errors. Most spelling error detection apparatus and programs
check each word in a text against the entries in a spelling dictionary.
Words in the text which are not found in the spelling dictionary are
assumed to be misspelled. The misspelled words are identified to the
text processing operator by, for example, highlighting the word on a
display device. Sometimes candidate words having spellings similar to
the misspelled word are also displayed to the operator as proposed
corrections.

The known apparatus and methods for detecting and correcting spelling
errors have several deficiencies. Most importantly, the known apparatus
and methods cannot detect a "wrong word" erroneous spelling (where the
erroneous spelling is itself a word in the spelling dictionary but is
not the word that was intended).

Moreover, even where the erroneous spelling does not appear in the
spelling dictionary, the prior apparatus and methods provide no means
or only limited means for ranking alternative candidates for the
correct spelling.

SUMMARY OF THE INVENTION 

It is an object of the invention to provide a method and apparatus for
detecting and correcting an error in an information signal, where the
information signal represents the wrong information. When the
information signal represents a word, the invention provides a method
and apparatus for detecting and correcting spelling errors, where
erroneously spelled words are correct entries in the spelling
dictionary, but are not the intended words.

It is another object of the invention to provide a method and apparatus
for estimating the probability of occurrence of a word whose spelling
is being checked, and to estimate the probabilities of one or more
alternative words as candidates for replacing the word being checked.

In a spelling error detection and correction method according to the
present invention, an input string of words T_i = W_i is provided. The
spelling of a first word T_1 = W_1 in the input string is changed to
form a second word W_2 different from the first word, to form a
candidate string of words W_c. The probability P(W_i) of occurrence of
the input string of words and the probability P(W_c) of occurrence of
the candidate string of words are estimated. The probability P(T_i|W_c)
of misrepresenting the candidate string of words W_c as the input
string of words T_i is also estimated. Thereafter, P(W_i) is compared
with the product P(W_c)P(T_i|W_c). A first output is produced if P(W_i)
is greater than P(W_c)P(T_i|W_c), otherwise a second output is
produced.

In one aspect of the invention, the first output comprises the input
string of words. The second output comprises the candidate string of
words. Alternatively, the second output may be an error indication.

The probability P(T_i|W_c) of misrepresenting the candidate string of
words as the input string of words may be estimated as the probability
P(T_1|W_2) of misspelling the second word W_2 as the first word T_1.

In the spelling error detection and correction method 
and apparatus according to the invention, each word in 
the input string and each word in the candidate string is 
a member of a set of correctly spelled words. 

Preferably, the method and apparatus according to the invention further
comprise the step of estimating the probability P(T_i|W_i) of correctly
spelling all of the words in the input string of words W_i. In this
case, the product P(W_i)P(T_i|W_i) is compared with the product
P(W_c)P(T_i|W_c). The first output is produced if P(W_i)P(T_i|W_i) is
greater than P(W_c)P(T_i|W_c), otherwise the second output is produced.

The probability P(T_i|W_i) of correctly spelling all of the words in
the input string may be estimated as the probability P(T_1|W_1) of
correctly spelling the first word W_1.
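
The comparison described in the preceding paragraphs can be sketched in a few lines of Python. This is only an illustration, not the claimed implementation: the function name `select_output` and its parameter names are hypothetical, ties are resolved in favour of the input string as in the abstract, and the numbers in the usage lines are the purely hypothetical "the horse ran" / "the house ran" probabilities used in the description of FIGS. 2 and 3.

```python
def select_output(p_input, p_candidate, p_misrepresent, p_correct=1.0):
    """Choose between the typed string and one candidate string.

    p_input        -- P(W_i), probability of occurrence of the input string
    p_candidate    -- P(W_c), probability of occurrence of the candidate
    p_misrepresent -- P(T_i|W_c), probability of typing the input when the
                      candidate was intended
    p_correct      -- P(T_i|W_i), probability of typing the input correctly;
                      1.0 reproduces the simplified comparison
    """
    left = p_input * p_correct
    right = p_candidate * p_misrepresent
    # Ties keep the input string (no correction), following the abstract.
    return "input" if left >= right else "candidate"


# Hypothetical figures from the "horse"/"house" illustration.
print(select_output(5e-5, 1e-8, 0.001))  # typed "the horse ran"
print(select_output(1e-8, 5e-5, 0.001))  # typed "the house ran"
```

With these figures the first call keeps the input and the second call selects the candidate, matching the outcomes described for FIGS. 2 and 3.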

According to an embodiment of the invention, the spelling of the first
word T_1 may be changed to form the second word W_2 by adding,
deleting, transposing, or replacing one or more letters in the first
word to form a tentative word. The tentative word is compared to each
word in the set of correctly spelled words. The tentative word is used
as the second word W_2 if the tentative word matches a word in the set
of correctly spelled words.
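
A minimal sketch of this candidate-generation step follows. It assumes a single edit per tentative word (the text allows one or more letters to be changed), a lowercase alphabet, and a dictionary held as a Python set; the function names `tentative_words` and `second_words` are hypothetical.

```python
import string


def tentative_words(word, alphabet=string.ascii_lowercase):
    """All strings one add, delete, transpose, or replace away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    transposes = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    replaces = {a + c + b[1:] for a, b in splits if b for c in alphabet}
    adds = {a + c + b for a, b in splits for c in alphabet}
    # The second word must differ from the first word.
    return (deletes | transposes | replaces | adds) - {word}


def second_words(word, dictionary):
    """Keep only tentative words that are entries in the dictionary."""
    return sorted(w for w in tentative_words(word) if w in dictionary)


dictionary = {"the", "horse", "house", "hose", "horses", "ran"}
print(second_words("horse", dictionary))
```

Each word returned is a possible second word W_2; every one of them is, by construction, a correctly spelled dictionary entry, which is what distinguishes a "wrong word" error from an ordinary misspelling.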

Alternatively, the spelling of the first word may be changed to form a
second word by identifying a confusion group of M different words in
the set of correctly spelled words. Each word in the confusion group
may, for example, have a spelling which differs from the first word by
no more than two letters. Alternatively, each word in the confusion
group may be one which is confusable with the first word. At least one
word in the confusion group is selected as the second word W_2.

Satisfactory results have been obtained in the method 
and apparatus according to the invention by estimating 
the probability of correctly spelling a word as 0.999. 
The probability of misspelling a word may be estimated 
to be (0.001/M). 
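
These two estimates can be combined into a small lookup, sketched below under stated assumptions: the confusion groups are held in a plain dictionary keyed by the typed word, M is taken as the size of the typed word's confusion group (as the Examples appear to do), and any pair outside a confusion group is given probability zero. The name `channel_probability` and the sample groups are illustrative only.

```python
def channel_probability(typed, intended, confusion_groups, p_correct=0.999):
    """Estimate P(typed | intended) for a single word."""
    if typed == intended:
        return p_correct
    group = confusion_groups.get(typed, ())
    if intended in group:
        # The misspelling mass (1 - p_correct) is shared over M words.
        return (1.0 - p_correct) / len(group)
    return 0.0  # assumption: words outside the group are never confused


groups = {
    "I": ["a"],
    "a": ["I", "at", "as", "an", "am", "ad", "ab", "pa", "or", "ha"],
}
print(channel_probability("I", "I", groups))  # correctly spelled word
print(channel_probability("a", "I", groups))  # "I" typed as "a", M = 10
```

Raising `p_correct` makes the method more reluctant to change a word; lowering it makes corrections more frequent.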

The spelling error detection and correction method 
and apparatus according to the present invention are 
advantageous because by comparing the probability of 
occurrence of the word being checked and the probabil- 
ities of occurrence of one or more spelling correction 
candidates, it is possible to detect and correct errors 
which are correct spellings of the wrong word. 

BRIEF DESCRIPTION OF THE DRAWING 

FIG. 1 is a block diagram of an embodiment of the 
spelling error detection and correction method accord- 
ing to the present invention. 

FIG. 2 is a block diagram of an example of the spel- 
ling error detection and correction method of FIG. 1. 

FIG. 3 is a block diagram of another example of the 
spelling error detection and correction method of FIG. 
1. 

FIG. 4 is a block diagram of an embodiment of a 
routine for changing the spelling of a first word to form 
a second word in the spelling error detection and cor- 
rection method according to the present invention. 
FIG. 5 is a block diagram of an alternative embodiment of a method of
changing the spelling of a first word to form a second word.

FIG. 6 is a block diagram of a preferred modification of the spelling
error detection and correction method shown in FIG. 1.

FIG. 7 is a block diagram of an embodiment of an apparatus for
detecting and correcting an error in an information signal.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The invention is a method of detecting and correcting an error in an
information signal. In the case where each information signal
represents a word which is a member of a set of correctly spelled
words, the invention provides a method of spelling error detection and
correction.

Referring to FIG. 1, the spelling error detection and correction method
starts with the step of providing an input string of words T_i = W_i.
Each word in the input string has a spelling.

Next, the spelling of a first word T_1 = W_1 in the input string is
changed to form a second word W_2 different from the first word, to
form a candidate string of words W_c.

In FIG. 1, the input string and the candidate string each comprise
three words. According to the invention, the strings may be any length
greater than or equal to two. Each string may be, for example, a
sentence or a phrase.

Next, the probability P(W_i) of occurrence of the input string of words
and the probability P(W_c) of occurrence of the candidate string of
words are estimated. These probabilities may be estimated empirically
by examining large bodies of text, as discussed below.

Also estimated is the probability P(T_i|W_c) of misrepresenting the
candidate string of words W_c as the input string of words T_i. The
probability P(T_i|W_c) may be chosen empirically by selecting different
values until satisfactory results are obtained, as discussed in the
Examples below.

After the required probabilities are estimated, the probability P(W_i)
is compared with the product of the probabilities P(W_c)P(T_i|W_c). If
P(W_i) is greater than or equal to the product P(W_c)P(T_i|W_c), then a
first output is produced. Otherwise, a second output is produced.

As shown in FIG. 1, the first output may be the input string W_i. The
second output may be the candidate string W_c.

Alternatively, the second output may be an error indication.

Two examples of the spelling error detection and correction method
according to the present invention are shown in FIGS. 2 and 3.
Referring to FIG. 2, the input string is a string of three words: "the
horse ran". Each word in the input string of words is a member of a set
of correctly spelled words. The first word T_1 = W_1 is "horse".

Next, the spelling of the first word "horse" is changed to form the
second word W_2, "house". The candidate string of words W_c is then
"the house ran". The second word "house" is also a member of the set of
correctly spelled words.

Continuing, the probability P(W_i) of occurrence of the input string of
words "the horse ran" is estimated to be 5×10^-5. The probability
P(W_c) of occurrence of the candidate string of words "the house ran"
is estimated to be 1×10^-8. While these probabilities are purely
hypothetical for the purpose of illustrating the operation of the
present invention, the hypothetical numbers illustrate that the
probability of occurrence of "the horse ran" is much greater than the
probability of occurrence of "the house ran".

Proceeding with the method, the probability P(T_i|W_c) of
misrepresenting the candidate string of words as the input string of
words is estimated to be equal to the probability P(T_1|W_2) of
misspelling the second word W_2 as the first word T_1. From experiment
it has been determined that an estimate of 0.001 produces satisfactory
results.

Finally, the value of P(W_i) is compared to the product
P(W_c)P(T_i|W_c). Since the former (5×10^-5) is greater than the latter
(1×10^-11), the input string of words is determined to be correct, and
the candidate string of words is rejected. Accordingly, the output is
"the horse ran".

FIG. 3 illustrates the operation of the spelling error detection and
correction method where the input string is "the house ran". Now, the
first word T_1 = W_1 is "house", and the second word W_2 is "horse". By
using the probabilities estimated in FIG. 2, the probability of the
input string (1×10^-8) is now less than the product of the probability
of the candidate string multiplied by the probability of
misrepresenting the candidate string as the input string (5×10^-8).
Therefore, the input string is now rejected, and the candidate string
is determined to be correct. The output is set to "the horse ran".

The spelling error detection and correction method according to the
present invention is based on the following theory. For each candidate
string of words (for example, for each candidate sentence) W_c, the
probability that the candidate sentence was actually intended, given
that the original sentence (input string of words) T_i = W_i was typed,
is given by

$$P(W_c \mid T_i) = \frac{P(W_c)\,P(T_i \mid W_c)}{P(T_i)} \tag{1}$$

In this equation, P(T_i|W_c) is the probability of misrepresenting the
candidate string of words W_c as the input string of words T_i = W_i.

The probability P(W_i|T_i) that the original sentence was actually
intended given that the original sentence was typed (that is, the
probability of correctly spelling all of the words in the original
sentence W_i) is compared to P(W_c|T_i). For simplicity, both sides of
the comparison are multiplied by P(T_i) so that the product
P(W_i)P(T_i|W_i) is compared with the product P(W_c)P(T_i|W_c). The
sentence with the higher probability is selected as the sentence which
was actually intended.

In order to further simplify the comparison, it may be assumed that the
probability P(T_i|W_i) of correctly spelling all of the words in the
original typed sentence is equal to 1.

The probabilities P(W_i) of occurrence of the input string of words and
P(W_c) of occurrence of the candidate string of words may be
approximated by the product of n-gram probabilities for all n-grams in
each string. That is, the probability of a string of words may be
approximated by the product of the conditional probabilities of each
word in the string, given the occurrence of the n-1 words (or absence
of words) preceding each word. For example, if n=3, each trigram
probability
may represent the probability of occurrence of the third word in the
trigram, given the occurrence of the first two words in the trigram.

The conditional probabilities may be determined empirically by
examining large bodies of text. For example, the conditional
probability f(W_z|W_x W_y) of word W_z given the occurrence of the
string W_x W_y may be estimated from the equation

$$f(W_z \mid W_x W_y) = \lambda_1 f_1(W_z \mid W_x W_y) + \lambda_2 f_2(W_z \mid W_y) + \lambda_3 f_3(W_z) + \lambda_4 f_4 \tag{2}$$

where

$$f_1(W_z \mid W_x W_y) = \frac{n_{xyz}}{n_{xy}} \tag{3}$$

$$f_2(W_z \mid W_y) = \frac{n_{yz}}{n_y} \tag{4}$$

$$f_3(W_z) = \frac{n_z}{n} \tag{5}$$

$$f_4 = \frac{1}{n} \tag{6}$$

$$\lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 = 1 \tag{7}$$

In equations (3)-(6), the count n_xyz is the number of occurrences of
the trigram W_x W_y W_z in a large body of training text. The count
n_xy is the number of occurrences of the bigram W_x W_y in the training
text. Similarly, n_yz is the number of occurrences of the bigram W_y
W_z in the training text, n_y is the number of occurrences of word W_y,
n_z is the number of occurrences of word W_z, and n is the total number
of words in the training text. The values of the coefficients λ_1, λ_2,
λ_3, and λ_4 in equations (2) and (7) may be estimated by the deleted
interpolation method described in an article by Lalit R. Bahl et al
entitled "A Maximum Likelihood Approach to Continuous Speech
Recognition" (IEEE Transactions on Pattern Analysis and Machine
Intelligence, Vol. PAMI-5, No. 2, March 1983, pages 179-190).

In the comparison of P(W_i)P(T_i|W_i) with P(W_c)P(T_i|W_c), the
probability P(T_i|W_c) may be approximated by the product of the
probabilities of misrepresenting each word in the candidate sentence as
the corresponding word in the original typed sentence. Where the
original typed sentence and the candidate sentence differ by only one
word (T_1 = W_1 in the original sentence and W_2 in the candidate
sentence), the probability P(T_i|W_c) can be estimated to be equal to
the probability P(T_1|W_2) of misspelling the second word as the first
word.

The probability of misspelling any given word should be estimated to
have a low value, for example 0.001. This value has been determined by
experiment to yield satisfactory results. By increasing the probability
of misspelling, the invention will find more misspellings; by
decreasing the probability of misspelling, the invention will find
fewer misspellings. When the word W_1 in the original typed sentence
has M misspellings which result in correct dictionary entries, the
probability of each misspelling becomes (0.001/M) in this example.

If the probability P(T_i|W_i) of correctly spelling all of the words in
the original typed sentence is not estimated as 1, it may be
approximated by the product of the probabilities of correctly spelling
each word in the original typed sentence. Where the original typed
sentence and the candidate sentence differ by only one word, the
probability P(T_i|W_i) may be estimated as the probability P(T_1|W_1)
of correctly spelling the first word T_1 = W_1.

FIG. 4 shows a subroutine which may be used to change the spelling of
the first word W_1 to the second word W_2. First, one or more letters
in the first word W_1 are changed to form a tentative word W_T. The
changes may be made by, for example, adding a letter to the first word,
deleting a letter from the first word, transposing two letters in the
first word, or replacing a letter in the first word.

The tentative word W_T is then compared to each word in a set of words
(a spelling dictionary) L. If the tentative word W_T matches a word in
the spelling dictionary L, then the second word W_2 is set equal to the
tentative word.

FIG. 5 shows an alternative subroutine for changing the spelling of a
word. In this routine, each word in the spelling dictionary is provided
with an associated confusion group of words L_c containing M different
words. For example, each word in the confusion group may have a
spelling which differs from the spelling of the first word W_1 by no
more than two letters. Alternatively, each word in a confusion group
may be a word which sounds like and is therefore confusable with the
first word (for example, "to", "two", and "too", or "principle" and
"principal"). For each candidate sentence, a word is selected from the
confusion group L_c as the second word W_2.

FIG. 6 shows a modification of the spelling error detection and
correction method of FIG. 1. The steps shown in FIG. 6 are intended to
replace the corresponding steps of FIG. 1.

According to the modification, the method further comprises the step of
estimating the probability P(T_i|W_i) of correctly spelling all of the
words in the input string of words T_i. The product P(W_i)P(T_i|W_i) is
compared with the product P(W_c)P(T_i|W_c). If the former is greater
than or equal to the latter, a first output (for example, the input
string) is produced. If the former is less than the latter, then a
second output (for example, the candidate string) is produced.

An apparatus for detecting and correcting an error in an information
signal, for example where each information signal represents a word
having a spelling, is preferably in the form of a programmed general
purpose digital computer. FIG. 7 shows an example of the organization
of such an apparatus.

As shown in FIG. 7, a word processor 10 provides an input string of
information signals T_i = W_i. Each information signal represents
information, such as a word. The word processor 10 is preferably a
program running on a central processor 12 which is also executing the
other functions of the apparatus. However, word processor 10 may
alternatively be running on its own central processor.

Under the direction of the program instructions in program instructions
store 14, the central processor 12 changes a first information signal
T_1 = W_1 in the input string T_i = W_i to form a second information
signal W_2 representing information which is different from the
information represented by the first information signal. This change
forms a candidate string of information signals W_c. Under the
direction of the program instructions, central processor 12 compares
the second information signal W_2 with the entries in the spelling
dictionary store 16 to be sure that the second information signal is an
entry in the spelling dictionary.

Having produced the input and candidate strings, central processor 12
is instructed to retrieve estimates of the probabilities of occurrence
of the input and candidate strings from the word string probability
store 18. The probability P(T_i|W_c) of misrepresenting the information
represented by the candidate string of information signals as the input
string of information signals is retrieved from store 20. Finally,
central processor 12 compares P(W_i) with the product
P(W_c)P(T_i|W_c). A first output signal is sent to, for example, a
display 22 if the former is greater than or equal to the latter.
Otherwise, a second output signal is sent to the display 22.
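
The word string probabilities held in store 18 could be derived from trigram counts using the interpolated estimate of equation (2). The sketch below is illustrative only. It assumes that the component estimates are the relative frequencies implied by the count definitions, that the fourth term is a uniform floor of 1/n (an assumption, since that equation is not legible in this copy), and that the λ weights are fixed placeholders rather than values fitted by the method of the cited Bahl et al. article. The function names are hypothetical.

```python
from collections import Counter


def train_counts(tokens):
    """Collect unigram, bigram, and trigram counts from a token list."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    trigrams = Counter(zip(tokens, tokens[1:], tokens[2:]))
    return unigrams, bigrams, trigrams, len(tokens)


def interpolated_trigram(wx, wy, wz, counts, lambdas=(0.6, 0.25, 0.1, 0.05)):
    """Estimate f(wz | wx wy) as a weighted sum of four estimates."""
    unigrams, bigrams, trigrams, n = counts
    l1, l2, l3, l4 = lambdas  # placeholder weights; they must sum to 1
    f1 = trigrams[(wx, wy, wz)] / bigrams[(wx, wy)] if bigrams[(wx, wy)] else 0.0
    f2 = bigrams[(wy, wz)] / unigrams[wy] if unigrams[wy] else 0.0
    f3 = unigrams[wz] / n
    f4 = 1.0 / n  # assumed uniform floor so unseen words keep some mass
    return l1 * f1 + l2 * f2 + l3 * f3 + l4 * f4


counts = train_counts("the horse ran the horse ate".split())
print(interpolated_trigram("the", "horse", "ran", counts))
print(interpolated_trigram("the", "horse", "zebra", counts))
```

Because the lower-order terms never vanish entirely, a trigram that was not seen in the training text still receives a small non-zero probability, which keeps the product over a whole sentence from collapsing to zero.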



The spelling error detection and correction method and apparatus
according to the present invention were tested on 3,044 sentences which
were systematically misspelled from 48 sentences. The 48 sentences were
chosen from the Associated Press News Wire and from the Proceedings of
the Canadian Parliament. Trigram conditional probabilities were
obtained from a large corpus of text consisting primarily of office
correspondence. Using a probability P(T_i|W_i) of 0.999, the method
selected the changed sentence 78% of the time. Of those sentences that
were changed, they were changed correctly 97% of the time.

Several examples selected from the above-described tests are described
below.

EXAMPLE I

In this example, the input word string (the original typed sentence) is
"I submit that is what is happening in this case." The word W_1 whose
spelling is being checked is "I". The word "I" has only the following
simple misspelling: "a". Therefore, the second word W_2 is "a", and the
candidate word string W_c (the candidate sentence) is "a submit that is
what is happening in this case."

Table 1 shows the input and candidate sentences, the trigrams which
make up each sentence, and the natural logarithms of the conditional
probabilities for each trigram. The experiment was performed with four
different values of the probability P_t of correctly spelling each
word: P_t=0.9999, P_t=0.999, P_t=0.99, or P_t=0.9.

Since the logarithms (base e) of the probabilities are estimated in
Table 1, the logarithms are added to produce estimates of the product
of the probabilities.

Table 2 shows the totals obtained from Table 1. For all values of P_t,
the original sentence W_i is selected over the alternative candidate
sentence W_c.

TABLE 1

Input Word String (T_i = W_i)

                              Trigram    Components of ln P(T_i|W_i) for P_t =
Word       Trigram            log prob.     0.9999      0.999       0.99        0.9
I          _ _ I               -3.47634   -0.00010   -0.00100   -0.01005   -0.10536
submit     _ I submit          -8.47750   -0.00010   -0.00100   -0.01005   -0.10536
that       I submit that       -1.23049   -0.00010   -0.00100   -0.01005   -0.10536
is         submit that is      -4.74311   -0.00010   -0.00100   -0.01005   -0.10536
what       that is what        -3.04882   -0.00010   -0.00100   -0.01005   -0.10536
is         is what is          -3.07193   -0.00010   -0.00100   -0.01005   -0.10536
happening  what is happening   -4.88977   -0.00010   -0.00100   -0.01005   -0.10536
in         is happening in     -1.72564   -0.00010   -0.00100   -0.01005   -0.10536
this       happening in this   -3.84228   -0.00010   -0.00100   -0.01005   -0.10536
case       in this case        -2.49284   -0.00010   -0.00100   -0.01005   -0.10536
.          this case .         -2.05863   -0.00010   -0.00100   -0.01005   -0.10536
ln P(W_i) =                   -39.05735
ln P(T_i|W_i) =                            -0.0011    -0.0110    -0.1106    -1.1590

Candidate Word String (W_c)

                              Trigram    Components of ln P(T_i|W_c) for P_t =
Word       Trigram            log prob.     0.9999      0.999       0.99        0.9
a          _ _ a               -3.96812   -9.21034   -6.90776   -4.60517   -2.30259
submit     _ a submit         -10.20667   -0.00010   -0.00100   -0.01005   -0.10536
that       a submit that       -3.69384   -0.00010   -0.00100   -0.01005   -0.10536
is         submit that is      -4.74311   -0.00010   -0.00100   -0.01005   -0.10536
what       that is what        -3.04882   -0.00010   -0.00100   -0.01005   -0.10536
is         is what is          -3.07193   -0.00010   -0.00100   -0.01005   -0.10536
happening  what is happening   -4.88977   -0.00010   -0.00100   -0.01005   -0.10536
in         is happening in     -1.72564   -0.00010   -0.00100   -0.01005   -0.10536
this       happening in this   -3.84228   -0.00010   -0.00100   -0.01005   -0.10536
case       in this case        -2.49284   -0.00010   -0.00100   -0.01005   -0.10536
.          this case .         -2.05863   -0.00010   -0.00100   -0.01005   -0.10536
ln P(W_c) =                   -43.74165
ln P(T_i|W_c) =                            -9.2113    -6.9178    -4.7057    -3.3562

TABLE 2

P_t        ln[P(W_i)P(T_i|W_i)]   ln[P(W_c)P(T_i|W_c)]
0.9999     -39.05845              -52.95299
0.999      -39.06836              -50.65941
0.99       -39.16790              -48.44732
0.9        -40.21632              -47.09784

EXAMPLE II

In this example, the input word string T_i = W_i is: "I submit that is
what is happening in this case". The first word T_1 = W_1 whose
spelling is being checked is "submit". The word "submit" has two simple
misspellings: "summit" or "submits". In this example, the second word
W_2 is selected to be "summit". Therefore, the candidate word string
W_c (the candidate sentence) is "I summit that is what is happening in
this case."

Table 3 shows the logarithms of the probabilities, and Table 4 provides
the totals for Table 3. Again, for each value of P_t, the original
sentence is selected over the candidate.

TABLE 3

Input Word String (T_i = W_i)

                              Trigram    Components of ln P(T_i|W_i) for P_t =
Word       Trigram            log prob.     0.9999      0.999       0.99        0.9
I          _ _ I               -3.47634   -0.00010   -0.00100   -0.01005   -0.10536
submit     _ I submit          -8.47750   -0.00010   -0.00100   -0.01005   -0.10536
that       I submit that       -1.23049   -0.00010   -0.00100   -0.01005   -0.10536
is         submit that is      -4.74311   -0.00010   -0.00100   -0.01005   -0.10536
what       that is what        -3.04882   -0.00010   -0.00100   -0.01005   -0.10536
is         is what is          -3.07193   -0.00010   -0.00100   -0.01005   -0.10536
happening  what is happening   -4.88977   -0.00010   -0.00100   -0.01005   -0.10536
in         is happening in     -1.72564   -0.00010   -0.00100   -0.01005   -0.10536
this       happening in this   -3.84228   -0.00010   -0.00100   -0.01005   -0.10536
case       in this case        -2.49284   -0.00010   -0.00100   -0.01005   -0.10536
.          this case .         -2.05863   -0.00010   -0.00100   -0.01005   -0.10536
ln P(W_i) =                   -39.05735
ln P(T_i|W_i) =                            -0.0011    -0.0110    -0.1106    -1.1590

Candidate Word String (W_c)

                              Trigram    Components of ln P(T_i|W_c) for P_t =
Word       Trigram            log prob.     0.9999      0.999       0.99        0.9
I          _ _ I               -3.47634   -0.00010   -0.00100   -0.01005   -0.10536
summit     _ I summit         -18.48245   -9.90349   -7.60090   -5.29832   -2.99573
that       I summit that       -5.49443   -0.00010   -0.00100   -0.01005   -0.10536
is         summit that is      -3.50595   -0.00010   -0.00100   -0.01005   -0.10536
what       that is what        -3.04882   -0.00010   -0.00100   -0.01005   -0.10536
is         is what is          -3.07193   -0.00010   -0.00100   -0.01005   -0.10536
happening  what is happening   -4.88977   -0.00010   -0.00100   -0.01005   -0.10536
in         is happening in     -1.72564   -0.00010   -0.00100   -0.01005   -0.10536
this       happening in this   -3.84228   -0.00010   -0.00100   -0.01005   -0.10536
case       in this case        -2.49284   -0.00010   -0.00100   -0.01005   -0.10536
.          this case .         -2.05863   -0.00010   -0.00100   -0.01005   -0.10536
ln P(W_c) =                   -52.08908
ln P(T_i|W_c) =                            -9.9045    -7.6109    -5.3988    -4.0493

TABLE 4

P_t        ln[P(W_i)P(T_i|W_i)]   ln[P(W_c)P(T_i|W_c)]
0.9999     -39.05845              -61.99357
0.999      -39.06836              -59.69999
0.99       -39.16790              -57.48790
0.9        -40.21632              -56.13842

EXAMPLE III

In this example, the input word string T_i = W_i (the original typed
sentence) is now "a submit that is what is happening in this case." The
first word T_1 = W_1 whose spelling is being checked is "a". The word
"a" has the following ten simple misspellings: "I", "at", "as", "an",
"am", "ad", "ab", "pa", "or", "ha".

A second word W_2 is selected to be "I". Therefore, the candidate
string is "I submit that is what is happening in this case."

The logarithms of the individual probabilities are shown in Table 5.
Note that the probability P(T_1|W_2) is equal to (1 - P_t)/M (where M
equals 10).

Table 6 provides the totals from Table 5. For all values of P_t, except
P_t=0.9, the original sentence is selected over the candidate. When
P_t=0.9, the candidate is selected over the original.
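
The dependence of this outcome on P_t can be checked with a short calculation in the logarithm domain. The sketch below is illustrative: it takes the two sentence log-probabilities from the Table 5 totals, treats the sentence as 11 tokens (ten words plus the final period, matching the eleven rows of the table), and assumes the changed word contributes ln((1 - P_t)/M) with M = 10 while every other token contributes ln P_t. The names `log_score` and `selected` are hypothetical.

```python
import math

LN_P_TYPED = -43.74165      # ln P(W_i) for "a submit that is ..."
LN_P_CANDIDATE = -39.05735  # ln P(W_c) for "I submit that is ..."
N_TOKENS = 11               # ten words plus the final period
M = 10                      # number of simple misspellings of "a"


def log_score(ln_p_string, n_tokens, p_t, m=None):
    """ln[P(W) P(T|W)]; m is given only for a string with one changed word."""
    if m is None:
        return ln_p_string + n_tokens * math.log(p_t)
    return (ln_p_string
            + (n_tokens - 1) * math.log(p_t)
            + math.log((1.0 - p_t) / m))


def selected(p_t):
    typed = log_score(LN_P_TYPED, N_TOKENS, p_t)
    candidate = log_score(LN_P_CANDIDATE, N_TOKENS, p_t, m=M)
    return "input" if typed >= candidate else "candidate"


for p_t in (0.9999, 0.999, 0.99, 0.9):
    print(p_t, selected(p_t))
```

Under these assumptions the typed sentence is kept for the three larger values of P_t and the candidate is chosen only when P_t = 0.9, in agreement with the statement above.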



TABLE 5 



Input 














Word 




Trigram 


COMPONENTS OF InPm/Wi) 


String 




Logarithm 


for 


for 


for 


for 


(Ti - Wi) 


Trigrams 


Probability 


Pt = 0,9999 


Pt « 0.999 


Pt = 0.99 


Pt~0.9 


a 


a 


-3.96812 


-0.00010 


-0.00100 


-0.01005. 


-0.10536 


submit 


_ a submit 


-10.20667 


-0.00010 


-0.00100 


-0.01005 


-0.10536 


that 


a submit that 


-3.69384 


-0.00010 


-0.00100 


-0.01005 


-0.10536 


is 


submit that is 


-4.74311 


-0.00010 


-0.00100 


-0.01005 


-0.10536 


what 


that is what . 


-3.04882 


-0.00010 


-0.00100 


-0.01005 


-0.10536 


is 


is what is 


-3.07193 


-0.00010 


-0.00100 


-0.01005 


-0.10536 


happening 


what is happening 


-4.88977 


-0.00010 


-0.00100 


-0.01005 


-0.10536 


in 


is happening in 


-1.72564 


-0.00010 


-0.00100 


-0.01005 


-0.10536 


this 


happening in this 


-3.84228 


-0.00010 


-0.00100 


-0.01005 


-0.10536 


case 


in this case 


-2.49284 


-0.00010 


-0.00100 


-0.01005 


-0.10536 




this case. 


-2.05863 


-0.00010 


-0.00100 


-0.01005 


-0.10536 




lnP(Wi) = 


-43.74165 










mp<Ti/Wi) = 




-0.00)1 


-0.0110 


-0.1106 


-1.1590 


TABLE 5-continued

Candidate     Trigram              Logarithm of    Components of ln P(T_i|W_c) for
word string                        trigram         P_t =      P_t =      P_t =      P_t =
(W_c)                              probability     0.9999     0.999      0.99       0.9

I             _ _ I                 -3.47634       -11.51293  -9.21034   -6.90776   -4.60517
submit        _ I submit            -8.47750       -0.00010   -0.00100   -0.01005   -0.10536
that          I submit that         -1.23049       -0.00010   -0.00100   -0.01005   -0.10536
is            submit that is        -4.74311       -0.00010   -0.00100   -0.01005   -0.10536
what          that is what          -3.04882       -0.00010   -0.00100   -0.01005   -0.10536
is            is what is            -3.07193       -0.00010   -0.00100   -0.01005   -0.10536
happening     what is happening     -4.88977       -0.00010   -0.00100   -0.01005   -0.10536
in            is happening in       -1.72564       -0.00010   -0.00100   -0.01005   -0.10536
this          happening in this     -3.84228       -0.00010   -0.00100   -0.01005   -0.10536
case          in this case          -2.49284       -0.00010   -0.00100   -0.01005   -0.10536
.             this case .           -2.05863       -0.00010   -0.00100   -0.01005   -0.10536

ln P(W_c) =                        -39.05735
ln P(T_i|W_c) =                                    -11.5139   -9.2203    -7.0083    -5.6588
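The totals in Table 5 can be reproduced with a few lines of arithmetic. The following is a minimal Python sketch, not part of the disclosed apparatus: the trigram logarithms are copied from Table 5, the name `log_scores` is hypothetical, and the value M = 10 is an assumption inferred from the table entry ln(0.0001/10) = -11.51293 for the changed word.

```python
import math

# Trigram log-probabilities copied from Table 5, one per trigram (11 each).
LN_TRIGRAMS_INPUT = [-3.96812, -10.20667, -3.69384, -4.74311, -3.04882,
                     -3.07193, -4.88977, -1.72564, -3.84228, -2.49284,
                     -2.05863]
LN_TRIGRAMS_CANDIDATE = [-3.47634, -8.47750, -1.23049, -4.74311, -3.04882,
                         -3.07193, -4.88977, -1.72564, -3.84228, -2.49284,
                         -2.05863]

# Assumption: M = 10 is inferred from the table, where the changed word's
# entry at P_t = 0.9999 equals ln(0.0001 / 10) = -11.51293.
M = 10


def log_scores(p_t):
    """Return (ln[P(W_i)P(T_i|W_i)], ln[P(W_c)P(T_i|W_c)]) for a given P_t."""
    n = len(LN_TRIGRAMS_INPUT)
    # Input string: all n words are taken to be typed as intended.
    input_score = sum(LN_TRIGRAMS_INPUT) + n * math.log(p_t)
    # Candidate string: n - 1 words typed as intended, one word mistyped.
    candidate_score = (sum(LN_TRIGRAMS_CANDIDATE)
                       + (n - 1) * math.log(p_t)
                       + math.log((1.0 - p_t) / M))
    return input_score, candidate_score


for p_t in (0.9999, 0.999, 0.99, 0.9):
    input_score, candidate_score = log_scores(p_t)
    decision = "correct" if candidate_score > input_score else "keep"
    print(p_t, round(input_score, 5), round(candidate_score, 5), decision)
```

With these table values, the candidate string scores higher than the input string only for P_t = 0.9.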



TABLE 6

P_t        ln[P(W_i)P(T_i|W_i)]    ln[P(W_c)P(T_i|W_c)]

0.99990    -43.74275               -50.57128
0.99900    -43.75266               -48.27770
0.99000    -43.85220               -46.06561
0.90000    -44.90062               -44.71613

EXAMPLE IV

In this example, the input word string T_i = W_i is "I
summit that is what is happening in this case." The first
word T_1 = W_1 whose spelling is being checked is "summit".
The word "summit" has two simple misspellings:
"submit" or "summit".

The second word W_2 is selected to be "submit".
Therefore, the candidate word string W_c is "I submit
that is what is happening in this case."

Table 7 shows the logarithms of the estimated probabilities
of the trigrams and of correctly spelling or incorrectly
spelling each word. Since M = 2, the probability
P(T_1|W_2) = (1 - P_t)/2.

Table 8 provides the totals from Table 7. For all
values of P_t, the candidate sentence is selected over the
original typed sentence. A correction is therefore made
in all cases.

TABLE 7

Input word    Trigram              Logarithm of    Components of ln P(T_i|W_i) for
string                             trigram         P_t =      P_t =      P_t =      P_t =
(T_i = W_i)                        probability     0.9999     0.999      0.99       0.9

I             _ _ I                 -3.47634       -0.00010   -0.00100   -0.01005   -0.10536
summit        _ I summit           -18.48245       -0.00010   -0.00100   -0.01005   -0.10536
that          I summit that         -5.49443       -0.00010   -0.00100   -0.01005   -0.10536
is            summit that is        -4.74311       -0.00010   -0.00100   -0.01005   -0.10536
what          that is what          -3.04882       -0.00010   -0.00100   -0.01005   -0.10536
is            is what is            -3.07193       -0.00010   -0.00100   -0.01005   -0.10536
happening     what is happening     -4.88977       -0.00010   -0.00100   -0.01005   -0.10536
in            is happening in       -1.72564       -0.00010   -0.00100   -0.01005   -0.10536
this          happening in this     -3.84228       -0.00010   -0.00100   -0.01005   -0.10536
case          in this case          -2.49284       -0.00010   -0.00100   -0.01005   -0.10536
.             this case .           -2.05863       -0.00010   -0.00100   -0.01005   -0.10536

ln P(W_i) =                        -52.08908
ln P(T_i|W_i) =                                    -0.0011    -0.0110    -0.1106    -1.1590

Candidate     Trigram              Logarithm of    Components of ln P(T_i|W_c) for
word string                        trigram         P_t =      P_t =      P_t =      P_t =
(W_c)                              probability     0.9999     0.999      0.99       0.9

I             _ _ I                 -3.47634       -11.51293  -9.21034   -6.90776   -4.60517
submit        _ I submit            -8.47750       -0.00010   -0.00100   -0.01005   -0.10536
that          I submit that         -1.23049       -0.00010   -0.00100   -0.01005   -0.10536
is            submit that is        -4.74311       -0.00010   -0.00100   -0.01005   -0.10536
what          that is what          -3.04882       -0.00010   -0.00100   -0.01005   -0.10536
is            is what is            -3.07193       -0.00010   -0.00100   -0.01005   -0.10536
happening     what is happening     -4.88977       -0.00010   -0.00100   -0.01005   -0.10536
in            is happening in       -1.72564       -0.00010   -0.00100   -0.01005   -0.10536
this          happening in this     -3.84228       -0.00010   -0.00100   -0.01005   -0.10536
case          in this case          -2.49284       -0.00010   -0.00100   -0.01005   -0.10536
.             this case .           -2.05863       -0.00010   -0.00100   -0.01005   -0.10536

ln P(W_c) =                        -39.05735
ln P(T_i|W_c) =                                    -9.9045    -7.6109    -5.3988    -4.0493



TABLE 8

P_t        ln[P(W_i)P(T_i|W_i)]    ln[P(W_c)P(T_i|W_c)]

0.99990    -52.09018               -48.96184
0.99900    -52.10009               -46.66826
0.99000    -52.19963               -44.45617
0.90000    -53.24805               -43.10669
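The entries of Table 8 follow from the two string totals and the per-word spelling probabilities. The sketch below is an illustration under stated assumptions rather than the patented implementation: the function names are hypothetical, the string has N = 11 tokens (ten words plus the final period), the input string is scored with P_t for every token, and the candidate string is scored with P_t for ten tokens and (1 - P_t)/M for the changed word, with M = 2 as stated in Example IV.

```python
import math


def string_log_scores(ln_p_wi, ln_p_wc, p_t, m, n_words):
    """Return (ln[P(W_i)P(T_i|W_i)], ln[P(W_c)P(T_i|W_c)])."""
    # Every word of the input string is assumed to be typed as intended.
    input_score = ln_p_wi + n_words * math.log(p_t)
    # One word of the candidate string is assumed mistyped as the input word.
    candidate_score = (ln_p_wc
                       + (n_words - 1) * math.log(p_t)
                       + math.log((1.0 - p_t) / m))
    return input_score, candidate_score


def should_correct(ln_p_wi, ln_p_wc, p_t, m, n_words):
    """True when the candidate string scores strictly higher than the input."""
    input_score, candidate_score = string_log_scores(
        ln_p_wi, ln_p_wc, p_t, m, n_words)
    return candidate_score > input_score


# Totals taken from Table 7.
LN_P_WI = -52.08908
LN_P_WC = -39.05735

for p_t in (0.9999, 0.999, 0.99, 0.9):
    scores = string_log_scores(LN_P_WI, LN_P_WC, p_t, m=2, n_words=11)
    print(p_t, [round(s, 5) for s in scores],
          should_correct(LN_P_WI, LN_P_WC, p_t, m=2, n_words=11))
```

Working in logarithms turns the product comparison into a comparison of sums, which avoids underflow when the string probabilities are very small.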



We claim: 

1. A spelling error detection method, said method
comprising the steps of:

providing an input string of words T_i = W_i produced
by a word processor, each word having a spelling;

changing the spelling of a first word T_1 = W_1 in the
input string to form a second word W_2 different
from the first word;

replacing the first word W_1 in the input string W_i
with the second word W_2 to form a candidate
string of words W_c;

estimating the probability P(W_i) of occurrence of the
input string of words;

estimating the probability P(W_c) of occurrence of the
candidate string of words;

estimating the probability P(T_i|W_c) of misrepresenting
the candidate string of words W_c as the input
string of words T_i;

comparing P(W_i) with the product P(W_c)P(T_i|W_c);
and

outputting the input string of words if P(W_i) is
greater than P(W_c)P(T_i|W_c), or outputting an
error indication if P(W_i) is less than
P(W_c)P(T_i|W_c).

2. A method as claimed in claim 1, characterized in
that:

the second output comprises the candidate string of
words; and

the probability P(T_i|W_c) is estimated as the probability
P(T_1|W_2) of misspelling the second word W_2 as
the first word T_1.

3. A method as claimed in claim 2, characterized in
that:

the method further comprises the step of providing a
set of words, each word having a spelling;

each word in the input string of words is a member of
the set of words; and

the second word W_2 is a member of the set of words.

4. A method as claimed in claim 3, characterized in
that:

the method further comprises the step of estimating
the probability P(T_i|W_i) of correctly spelling all of
the words in the input string of words W_i;

the step of comparing comprises comparing the product
P(W_i)P(T_i|W_i) with the product P(W_c)P(T_i|W_c); and

the step of outputting comprises outputting the first
output if P(W_i)P(T_i|W_i) is greater than
P(W_c)P(T_i|W_c), or outputting the second output if
P(W_i)P(T_i|W_i) is less than P(W_c)P(T_i|W_c).

5. A method as claimed in claim 4, characterized in
that the probability P(T_i|W_i) is estimated as the probability
P(T_1|W_1) of correctly spelling the first word T_1.

6. A method as claimed in claim 5, characterized in
that the step of changing the spelling of the first word
W_1 to form the second word W_2 comprises:

adding a letter to the first word to form a tentative
word;

comparing the tentative word to each word in the set
of words; and

using the tentative word as the second word W_2 if the
tentative word matches a word in the set of words.

7. A method as claimed in claim 5, characterized in
that the step of changing the spelling of the first word
W_1 to form the second word W_2 comprises:

deleting a letter from the first word to form a tentative
word;

comparing the tentative word to each word in the set
of words; and

using the tentative word as the second word W_2 if the
tentative word matches a word in the set of words.

8. A method as claimed in claim 5, characterized in
that:

the first word comprises at least two letters; and

the step of changing the spelling of the first word W_1
to form the second word W_2 comprises:

transposing at least two letters in the first word to
form a tentative word;

comparing the tentative word to each word in the set
of words; and

using the tentative word as the second word W_2 if the
tentative word matches a word in the set of words.

9. A method as claimed in claim 5, characterized in
that:

the first word comprises at least one letter; and

the step of changing the spelling of the first word W_1
to form the second word W_2 comprises:

replacing a letter in the first word with a different
letter to form a tentative word;

comparing the tentative word to each word in the set
of words; and

using the tentative word as the second word W_2 if the
tentative word matches a word in the set of words.

10. A method as claimed in claim 5, characterized in
that the step of changing the spelling of the first word
W_1 to form the second word W_2 comprises:

identifying a confusion group of M different words in
the set of words, each word in the confusion group
having a spelling which differs from the spelling of
the first word by no more than two letters; and

selecting one word in the confusion group as the
second word W_2.

11. A method as claimed in claim 5, characterized in
that the step of changing the spelling of the first word
W_1 to form the second word W_2 comprises:

identifying a confusion group of M different words in
the set of words, each word in the confusion group
being confusable with the first word; and

selecting one word in the confusion group as the
second word W_2.

12. A method as claimed in claim 11, characterized in
that:

the probability P(T_1|W_1) is estimated to be 0.999; and

the probability P(T_i|W_c) is estimated to be
(0.001/M).

13. A method of detecting an error in an information
signal, said method comprising the steps of:

providing an input string of information signals
T_i = W_i, each information signal representing
information;

changing a first information signal T_1 = W_1 in the
input string to form a second information signal
W_2 representing information different from the
information represented by the first information
signal;

replacing the first information signal W_1 in the input
string W_i with the second information signal W_2 to
form a candidate string of information signals W_c;

estimating the probability P(W_i) of occurrence of the
input string of information signals;

estimating the probability P(W_c) of occurrence of the
candidate string of information signals;

estimating the probability P(T_i|W_c) of misrepresenting
the information represented by the candidate
string of information signals W_c as the input string
of information signals T_i;

comparing P(W_i) with the product P(W_c)P(T_i|W_c);
and

outputting the input string of information signals if
P(W_i) is greater than P(W_c)P(T_i|W_c), or outputting
an error indication signal if P(W_i) is less than
P(W_c)P(T_i|W_c).

14. A method as claimed in claim 13, characterized in
that:

the second output signal comprises the candidate
string of information signals; and

the probability P(T_i|W_c) is estimated as the probability
P(T_1|W_2) of misrepresenting the information
represented by the second information signal W_2 as
the first information signal T_1.

15. A method as claimed in claim 14, characterized in
that:

the method further comprises the step of providing a
set of words, each word having a spelling;

each information signal in the input string of information
signals represents a word which is a member of
the set of words; and

the second information signal W_2 represents a word
which is a member of the set of words, the word
represented by the second information signal being
different from the word being represented by the
first information signal.

16. A method as claimed in claim 15, characterized in
that:

the method further comprises the step of estimating
the probability P(T_i|W_i) of correctly representing
the information represented by all of the information
signals in the input string of information signals W_i;

the step of comparing comprises comparing the product
P(W_i)P(T_i|W_i) with the product
P(W_c)P(T_i|W_c); and

the step of outputting comprises outputting the first
output signal if P(W_i)P(T_i|W_i) is greater than
P(W_c)P(T_i|W_c), or outputting the second output
signal if P(W_i)P(T_i|W_i) is less than
P(W_c)P(T_i|W_c).

17. A method as claimed in claim 16, characterized in
that: the probability P(T_i|W_i) is estimated as the probability
P(T_1|W_1) of correctly representing the information
represented by the first information signal T_1.

18. A method as claimed in claim 17, characterized in
that the step of changing the first information signal W_1
to form the second information signal W_2 comprises:

adding a letter to the word represented by the first
information signal to form a tentative word;

comparing the tentative word to each word in the set
of words; and

representing the tentative word as the second information
signal W_2 if the tentative word matches a
word in the set of words.

19. A method as claimed in claim 17, characterized in
that the step of changing the first information signal W_1
to form the second information signal W_2 comprises:

deleting a letter from the word represented by the
first information signal to form a tentative word;

comparing the tentative word to each word in the set
of words; and

representing the tentative word as the second information
signal W_2 if the tentative word matches a
word in the set of words.

20. A method as claimed in claim 17, characterized in
that:

the first information signal represents a word having
at least two letters; and

the step of changing the first information signal W_1 to
form the second information signal W_2 comprises:

transposing at least two letters in the word represented
by the first information signal to form a
tentative word;

comparing the tentative word to each word in the set
of words; and

representing the tentative word as the second information
signal W_2 if the tentative word matches a
word in the set of words.

21. A method as claimed in claim 17, characterized in
that:

the first information signal represents a word having
at least one letter; and

the step of changing the first information signal W_1 to
form the second information signal W_2 comprises:

replacing a letter in the word represented by the first
information signal to form a tentative word;

comparing the tentative word to each word in the set
of words; and

representing the tentative word as the second information
signal W_2 if the tentative word matches a
word in the set of words.

22. A method as claimed in claim 17, characterized in
that the step of changing the first information signal W_1
to form the second information signal W_2 comprises:

identifying a confusion group of M different words in
the set of words, each word in the confusion group
having a spelling which differs from the spelling of
the word represented by the first information signal
by no more than two letters; and

representing one word in the confusion group as the
second information signal W_2.

23. A method as claimed in claim 17, characterized in
that the step of changing the first information signal W_1
to form the second information signal W_2 comprises:

identifying a confusion group of M different words in
the set of words, each word in the confusion group
being confusable with the word represented by the
first information signal; and

representing one word in the confusion group as the
second information signal W_2.

24. A method as claimed in claim 23, characterized in
that:

the probability P(T_1|W_1) is estimated to be 0.999; and

the probability P(T_i|W_c) is estimated to be
(0.001/M).

25. An apparatus for detecting an error in an information
signal, said apparatus comprising:

means for providing an input string of information
signals T_i = W_i, each information signal representing
information;

means for changing a first information signal T_1 = W_1
in the input string to form a second information
signal W_2 representing information different from
the information represented by the first information
signal;

means for replacing the first information signal W_1 in
the input string with the second information signal
W_2 to form a candidate string of information signals
W_c;

means for estimating the probability P(W_i) of occurrence
of the input string of information signals;

means for estimating the probability P(W_c) of occurrence
of the candidate string of information signals;

means for estimating the probability P(T_i|W_c) of
misrepresenting the information represented by the
candidate string of information signals W_c as the
input string of information signals T_i;

means for comparing P(W_i) with the product
P(W_c)P(T_i|W_c); and

means for outputting the input string of information
signals if P(W_i) is greater than P(W_c)P(T_i|W_c), or
outputting an error indication signal if P(W_i) is less
than P(W_c)P(T_i|W_c).

26. An apparatus as claimed in claim 25, characterized
in that:

the second output signal comprises the candidate
string of information signals; and

the probability P(T_i|W_c) is estimated as the probability
P(T_1|W_2) of misrepresenting the information
represented by the second information signal W_2 as
the first information signal T_1.

27. An apparatus as claimed in claim 26, characterized
in that:

the apparatus further comprises dictionary means for
storing a set of words, each word having a spelling;

each information signal in the input string of information
signals represents a word which is a member of
the set of words; and

the second information signal represents a word
which is a member of the set of words, the word
represented by the second information signal being
different from the word being represented by the
first information signal.

28. An apparatus as claimed in claim 27, characterized
in that:

the apparatus further comprises means for estimating
the probability P(T_i|W_i) of correctly representing
all of the information represented by the input
string of information signals W_i;

the means for comparing comprises means for comparing
the product P(W_i)P(T_i|W_i) with the product
P(W_c)P(T_i|W_c); and

the means for outputting comprises means for outputting
the first output signal if P(W_i)P(T_i|W_i) is
greater than P(W_c)P(T_i|W_c), or outputting the
second output signal if P(W_i)P(T_i|W_i) is less than
P(W_c)P(T_i|W_c).

29. An apparatus as claimed in claim 28, characterized
in that: the probability P(T_i|W_i) is estimated as the
probability P(T_1|W_1) of correctly representing the
information represented by the first information signal
W_1.

30. An apparatus as claimed in claim 29, characterized
in that the means for changing the first information
signal W_1 to form the second information signal W_2
comprises:

means for adding a letter to the word represented by
the first information signal to form a tentative
word;

means for comparing the tentative word to each
word in the set of words; and

means for representing the tentative word as the second
information signal W_2 if the tentative word
matches a word in the set of words.

31. An apparatus as claimed in claim 29, characterized
in that the means for changing the first information
signal W_1 to form the second information signal W_2
comprises:

means for deleting a letter from the word represented
by the first information signal to form a tentative
word;

means for comparing the tentative word to each
word in the set of words; and

means for representing the tentative word as the second
information signal W_2 if the tentative word
matches a word in the set of words.

32. An apparatus as claimed in claim 27, characterized
in that:

the first information signal represents a word having
at least two letters; and

the means for changing the first information signal
W_1 to form the second information signal W_2
comprises:

means for transposing at least two letters in the word
represented by the first information signal to form
a tentative word;

means for comparing the tentative word to each
word in the set of words; and

means for representing the tentative word as the second
information signal W_2 if the tentative word
matches a word in the set of words.

33. An apparatus as claimed in claim 29, characterized
in that:

the first information signal represents a word having
at least one letter; and

the means for changing the first information signal
W_1 to form the second information signal W_2
comprises:

means for replacing a letter in the word represented
by the first information signal to form a tentative
word;

means for comparing the tentative word to each
word in the set of words; and

means for representing the tentative word as the second
information signal W_2 if the tentative word
matches a word in the set of words.

34. An apparatus as claimed in claim 29, characterized
in that the means for changing the first information
signal W_1 to form the second information signal W_2
comprises:

means for identifying a confusion group of M different
words in the set of words, each word in the
confusion group having a spelling which differs
from the spelling of the word represented by the
first information signal by no more than two letters;
and

means for representing one word in the confusion
group as the second information signal W_2.

35. An apparatus as claimed in claim 29, characterized
in that the means for changing the first information
signal W_1 to form the second information signal W_2
comprises:

means for identifying a confusion group of M different
words in the set of words, each word in the
confusion group being confusable with the word
represented by the first information signal; and

means for representing one word in the confusion
group as the second information signal W_2.

36. A method as claimed in claim 35, characterized in
that:

the probability P(T_1|W_1) is estimated to be 0.999; and

the probability P(T_i|W_c) is estimated to be
(0.001/M).
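
The steps recited in the method claims can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the claimed apparatus: the function names, the lowercase alphabet, and the toy scoring table `TOY_LN_P` are hypothetical, and the confusion group is assumed here to consist of dictionary words reachable by adding, deleting, transposing, or replacing a single letter. The estimates 0.999 and 0.001/M are taken from claims 12, 24, and 36.

```python
import math
import string


def single_edit_variants(word, alphabet=string.ascii_lowercase):
    """All strings formed by adding, deleting, transposing, or replacing a letter."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    additions = {a + c + b for a, b in splits for c in alphabet}
    deletions = {a + b[1:] for a, b in splits if b}
    transpositions = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    replacements = {a + c + b[1:] for a, b in splits if b
                    for c in alphabet if c != b[0]}
    return (additions | deletions | transpositions | replacements) - {word}


def confusion_group(word, dictionary):
    """Tentative words that match a word in the set of words."""
    return sorted(single_edit_variants(word) & set(dictionary))


def check_word(words, index, dictionary, ln_p_string, p_first=0.999):
    """Return the input string, or the best candidate string if it scores higher."""
    group = confusion_group(words[index], dictionary)
    best = list(words)
    best_score = ln_p_string(words) + math.log(p_first)
    if not group:
        return best
    # Probability of mistyping a confusion-group word as the input word.
    ln_misspell = math.log((1.0 - p_first) / len(group))
    for second_word in group:
        candidate = list(words[:index]) + [second_word] + list(words[index + 1:])
        score = ln_p_string(candidate) + ln_misspell
        if score > best_score:
            best, best_score = candidate, score
    return best


# Hypothetical log-probabilities standing in for a trigram language model.
TOY_LN_P = {("the", "horse", "ran"): -10.0, ("the", "house", "ran"): -25.0}


def toy_ln_p(words):
    return TOY_LN_P.get(tuple(words), -40.0)


DICTIONARY = {"the", "horse", "house", "ran"}
print(check_word(["the", "house", "ran"], 1, DICTIONARY, toy_ln_p))
```

In practice the scoring function would be an estimate of the string probability, such as the trigram model used in the examples, and each word of the input string would be checked in turn.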